DERI&UPM: Pushing Corpus Based Relatedness to Similarity: Shared Task System Description
نویسندگان
چکیده
In this paper, we describe our system submitted for the semantic textual similarity (STS) task at SemEval 2012. We implemented two approaches to calculate the degree of similarity between two sentences. First approach combines corpus-based semantic relatedness measure over the whole sentence with the knowledge-based semantic similarity scores obtained for the words falling under the same syntactic roles in both the sentences. We fed all these scores as features to machine learning models to obtain a single score giving the degree of similarity of the sentences. Linear Regression and Bagging models were used for this purpose. We used Explicit Semantic Analysis (ESA) as the corpus-based semantic relatedness measure. For the knowledgebased semantic similarity between words, a modified WordNet based Lin measure was used. Second approach uses a bipartite based method over the WordNet based Lin measure, without any modification. This paper shows a significant improvement in calculating the semantic similarity between sentences by the fusion of the knowledge-based similarity measure and the corpus-based relatedness measure against corpus based measure taken alone.
منابع مشابه
CogALex-V Shared Task: LexNET - Integrated Path-based and Distributional Method for the Identification of Semantic Relations
We present a submission to the CogALex 2016 shared task on the corpus-based identification of semantic relations, using LexNET (Shwartz and Dagan, 2016), an integrated path-based and distributional method for semantic relation classification. The reported results in the shared task bring this submission to the third place on subtask 1 (word relatedness), and the first place on subtask 2 (semant...
متن کاملUPM system for WMT 2012
This paper describes the UPM system for the Spanish-English translation task at the NAACL 2012 workshop on statistical machine translation. This system is based on Moses. We have used all available free corpora, cleaning and deleting some repetitions. In this paper, we also propose a technique for selecting the sentences for tuning the system. This technique is based on the similarity with the ...
متن کاملFBK-TR: SVM for Semantic Relatedeness and Corpus Patterns for RTE
This paper reports the description and scores of our system, FBK-TR, which participated at the SemEval 2014 task #1 "Evaluation of Compositional Distributional Semantic Models on Full Sentences through Semantic Relatedness and Entailment". The system consists of two parts: one for computing semantic relatedness, based on SVM, and the other for identifying the entailment values on the basis of b...
متن کاملPresentation of an efficient automatic short answer grading model based on combination of pseudo relevance feedback and semantic relatedness measures
Automatic short answer grading (ASAG) is the automated process of assessing answers based on natural language using computation methods and machine learning algorithms. Development of large-scale smart education systems on one hand and the importance of assessment as a key factor in the learning process and its confronted challenges, on the other hand, have significantly increased the need for ...
متن کاملPresentation of an efficient automatic short answer grading model based on combination of pseudo relevance feedback and semantic relatedness measures
Automatic short answer grading (ASAG) is the automated process of assessing answers based on natural language using computation methods and machine learning algorithms. Development of large-scale smart education systems on one hand and the importance of assessment as a key factor in the learning process and its confronted challenges, on the other hand, have significantly increased the need for ...
متن کامل